Large Scale Text Analysis

Authors

  • Kevin Tee
  • Xinchen Ye
  • Weijia Jin
  • Johanna Ye

Abstract

We take an algorithmic and computational approach to the problem of providing patent recommendations, developing a web interface that allows users to upload their draft patent and returns a list of ranked relevant patents in real time. We develop scalable, distributed algorithms based on optimization techniques and sparse machine learning, with a focus on both accuracy and speed.
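The abstract does not say which algorithms the authors use, so the following is only a minimal sketch of the general idea of returning a ranked list of relevant documents for an uploaded draft. It assumes TF-IDF weighting over sparse term vectors and cosine similarity, neither of which is confirmed by the source; the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`) and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document.

    Returns the vectors together with the IDF table so that a query can be
    weighted consistently with the corpus.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed weighting, not taken from the paper).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    vectors, idf = tfidf_vectors(docs)
    # Terms unseen in the corpus carry no information and are skipped.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(i, cosine(q, v)) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-ins for a patent corpus and an uploaded draft.
corpus = [
    "method for wireless battery charging",
    "apparatus for brewing coffee",
    "wireless charging pad for electric vehicles",
]
ranking = rank("wireless charging device", corpus)
```

Storing each vector as a dictionary keeps only the nonzero entries, which is what makes this kind of scoring practical for large vocabularies; a system at the scale the abstract describes would presumably also need an index and distributed computation, which this sketch does not attempt.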


Related resources

Turning Quantitative: An Analytic Scale to Do Critical Discourse Analysis

Critical Discourse Analysis (CDA) could be seen as a theory in qualitative more than in quantitative studies. This might have led to difficulty in doing CDA. Accordingly, this study attempted to develop a quantitative profile in the form of an analytic rubric. For this purpose, Fairclough’s model of CDA was selected as the research framework. The techniques used for structuring analy...


Chapter 9 MINING TEXT STREAMS

The large amount of text data which are continuously produced over time in a variety of large scale applications such as social networks results in massive streams of data. Typically massive text streams are created by very large scale interactions of individuals, or by structured creations of particular kinds of content by dedicated organizations. An example in the latter category would be the...


A comparative study of the text inside the Mihrabi rug by Zareh Penyamin and Topkapi Palace Museum according to the existing discourse in the 16th and 19th

In the city of Hereke, Turkey, at the end of the 19th century, rugs known as Mihrabi became popular; they were inspired by the rugs of the Safavid era kept in the Topkapi Palace Museum. In these rugs, which were reproduced in royal workshops on a large scale, some changes were made to the verbal text and the incorporated visual elements. Among the rugs that seem to have ...


Large Scale Corpus Analysis and Recent Applications

Recent progress in corpus-based and machine-learning-based natural language processing methodologies has made it possible to handle large-scale corpora with quite high accuracy. The speaker is now involved in a project to construct a large-scale balanced corpus of contemporary Japanese, aiming to build automatic annotation tools for various levels of natural language analysis. I will first i...


Automating the analysis of collaborative discourse: identifying idea clusters

This poster explores CSCL practices relating to the use of a tool that employs information visualization techniques and large-scale text processing and analysis to complement qualitative analysis of collaborative discourse. Results from latent semantic analysis and qualitative analysis of online discussion transcripts are compared. Findings suggest that such tools that automate analyses of larg...


Straightforward Feature Selection for Scalable Latent Semantic Indexing

Latent Semantic Indexing (LSI) has been validated as effective on many small-scale text collections. However, little evidence has shown its effectiveness on unsampled large-scale text corpora, owing to its high computational complexity. In this paper, we propose a straightforward feature selection strategy, named Feature Selection for Latent Semantic Indexing (FSLSI), as a preprocess...




Publication date: 2015